Problem Statement¶
Business Context¶
Workplace safety in hazardous environments like construction sites and industrial plants is crucial to prevent accidents and injuries. One of the most important safety measures is ensuring workers wear safety helmets, which protect against head injuries from falling objects and machinery. Non-compliance with helmet regulations increases the risk of serious injuries or fatalities, making effective monitoring essential, especially in large-scale operations where manual oversight is prone to errors and inefficiency.
To overcome these challenges, SafeGuard Corp plans to develop an automated image analysis system capable of detecting whether workers are wearing safety helmets. This system will improve safety enforcement, ensuring compliance and reducing the risk of head injuries. By automating helmet monitoring, SafeGuard aims to enhance efficiency, scalability, and accuracy, ultimately fostering a safer work environment while minimizing human error in safety oversight.
Objective¶
As a data scientist at SafeGuard Corp, you are tasked with developing an image classification model that classifies images into one of two categories:
- With Helmet: Workers wearing safety helmets.
- Without Helmet: Workers not wearing safety helmets.
Data Description¶
The dataset consists of 631 images, almost equally divided into two categories:
- With Helmet: 311 images showing workers wearing helmets.
- Without Helmet: 320 images showing workers not wearing helmets.
Dataset Characteristics:
- Variations in Conditions: Images include diverse environments such as construction sites, factories, and industrial settings, with variations in lighting, angles, and worker postures to simulate real-world conditions.
- Worker Activities: Workers are depicted in different actions such as standing, using tools, or moving, ensuring robust model learning for various scenarios.
Installing and Importing the Necessary Libraries¶
!pip install tensorflow[and-cuda] numpy==1.25.2 -q
(pip install output: download progress bars, followed by a "Getting requirements to build wheel did not run successfully" error raised by a dependency's build subprocess. See the note below.)
import tensorflow as tf
print("Num GPUs Available:", len(tf.config.list_physical_devices('GPU')))
print(tf.__version__)
Num GPUs Available: 1
2.19.0
Note:
After running the above cell, kindly restart the notebook kernel (for Jupyter Notebook) or runtime (for Google Colab) and run all cells sequentially from the next cell.
On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in this notebook.
import os
import random
import numpy as np # numpy for array and matrix operations
import pandas as pd # pandas to read CSV files
import seaborn as sns
import matplotlib.image as mpimg # matplotlib.image for reading image files
import matplotlib.pyplot as plt # matplotlib for plotting and visualizing images
import math # math module to perform mathematical operations
import cv2
# TensorFlow / Keras modules
import keras
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator # ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential, Model # model classes to define our networks
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization # layers to build our CNN model
from tensorflow.keras.optimizers import Adam, SGD # optimizers which can be used in our model
from keras.applications.vgg16 import VGG16 # pre-trained VGG-16 model for transfer learning
# scikit-learn modules
from sklearn import preprocessing # preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split # to split the data into train and test sets
# Functions for evaluating the performance of classification models
from sklearn.metrics import confusion_matrix, f1_score, accuracy_score, recall_score, precision_score, classification_report
from sklearn.metrics import mean_squared_error as mse
# Display images using OpenCV inside Colab
from google.colab.patches import cv2_imshow
# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
# Set the seed using keras.utils.set_random_seed. This will set:
# 1) `numpy` seed
# 2) backend random seed
# 3) `python` random seed
tf.keras.utils.set_random_seed(812)
Section 1: Data Overview¶
Loading the data¶
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
images_file = '/content/drive/My Drive/Colab/ICV Proj/images_proj.npy'
labels_file = '/content/drive/My Drive/Colab/ICV Proj/Labels_proj.csv'
images = np.load(images_file)
labels_df = pd.read_csv(labels_file)
labels = labels_df['Label'].values
print("Shape of images array:", images.shape)
print("Shape of labels array:", labels.shape)
Shape of images array: (631, 200, 200, 3)
Shape of labels array: (631,)
Section 2: Exploratory Data Analysis¶
Plot random images from each of the classes and print their corresponding labels.¶
with_helmet_indices = np.where(labels == 1)[0]
without_helmet_indices = np.where(labels == 0)[0]
random_with_helmet_indices = np.random.choice(with_helmet_indices, 5, replace=False)
random_without_helmet_indices = np.random.choice(without_helmet_indices, 5, replace=False)
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
fig.suptitle('Sample Images from Dataset', fontsize=16)
for i, idx in enumerate(random_with_helmet_indices):
axes[0, i].imshow(images[idx])
axes[0, i].set_title('With Helmet')
axes[0, i].axis('off')
for i, idx in enumerate(random_without_helmet_indices):
axes[1, i].imshow(images[idx])
axes[1, i].set_title('Without Helmet')
axes[1, i].axis('off')
plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()
Checking for class imbalance¶
with_helmet_count = np.sum(labels == 1)
without_helmet_count = np.sum(labels == 0)
print(f"With Helmet: {with_helmet_count}")
print(f"Without Helmet: {without_helmet_count}")
plt.figure(figsize=(8, 6))
sns.countplot(x=labels)
plt.title('Distribution of Helmet Classes')
plt.xticks([0, 1], ['Without Helmet', 'With Helmet'])
plt.xlabel('Class')
plt.ylabel('Count')
plt.show()
With Helmet: 311
Without Helmet: 320
Class Balance: The dataset is very well-balanced. This is excellent and means we don't need special techniques to handle imbalance.
Image Variety: The "With Helmet" images show good variety in lighting, angles, and backgrounds.
"Without Helmet" Images: The "Without Helmet" images appear to be mostly close-ups of faces.
Color Issue (a quick note): The sample images all look blue because cv2 loads images in BGR channel order, while matplotlib displays them as RGB. In the next step we will convert all images to grayscale, which sidesteps this channel-order problem.
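As a side note, if we wanted to display the images in their true colors instead of converting to grayscale, the fix is just reversing the channel axis. A minimal sketch using a NumPy slice (`cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` is the usual OpenCV call; the `bgr_to_rgb` helper name here is illustrative):

```python
import numpy as np

def bgr_to_rgb(img: np.ndarray) -> np.ndarray:
    """Reverse the last (channel) axis of an H x W x 3 array."""
    return img[..., ::-1]

# Tiny 1x1 "image": pure blue in BGR order is (255, 0, 0)
bgr_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb_pixel = bgr_to_rgb(bgr_pixel)
print(rgb_pixel[0, 0])  # -> [  0   0 255], i.e. blue in RGB order
```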
Section 3: Data Preprocessing¶
Converting images to grayscale¶
grayscale_images = []
# convert them to grayscale
for img in images:
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
grayscale_images.append(gray_img)
grayscale_images = np.array(grayscale_images)
print("Shape of grayscale images array:", grayscale_images.shape)
# Plot an original BGR image and its grayscale version
plt.figure(figsize=(10, 5))
# Plot original BGR image
plt.subplot(1, 2, 1)
plt.imshow(images[0]) # Display the first image (BGR format, will appear blue)
plt.title('Original BGR Image')
plt.axis('off')
# Plot grayscale image
plt.subplot(1, 2, 2)
plt.imshow(grayscale_images[0], cmap='gray') # Display the grayscale version with 'gray' colormap
plt.title('Grayscale Image')
plt.axis('off')
plt.show()
Shape of grayscale images array: (631, 200, 200)
Why Grayscale images?¶
Simpler Data: The model only needs to learn from pixel brightness (a value from 0-255). It doesn't need to worry about complex color information (like "is a blue object a helmet?" or "is a yellow object a helmet?"). For this problem, the shape of the helmet is much more important than its color.
Faster Training and Processing: With a single channel, the data size for each image is now 1/3 of what it was (`(200, 200, 1)` instead of `(200, 200, 3)`), which makes the model train much faster.
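The one-third claim is easy to verify on dummy arrays of the same dtype and size:

```python
import numpy as np

rgb = np.zeros((200, 200, 3), dtype=np.uint8)   # one color image
gray = np.zeros((200, 200, 1), dtype=np.uint8)  # its grayscale version

print(rgb.nbytes, gray.nbytes)        # -> 120000 40000 (bytes per image)
print(rgb.nbytes // gray.nbytes)      # -> 3
```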
Splitting the dataset¶
Standard data splitting into Train, Validation, and Test sets:
- 80/20 split between training data and held-out data
- stratify=labels to preserve the with/without-helmet balance in every split
- the held-out 20% is split 50/50 into validation and test sets, so a test set is kept aside
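The split sizes reported further below can be worked out ahead of time, assuming scikit-learn's usual rounding (the test count is the ceiling of the fraction times the sample count):

```python
import math

n = 631
n_temp = math.ceil(n * 0.2)       # held-out portion, rounded up
n_train = n - n_temp
n_test = math.ceil(n_temp * 0.5)  # half of the held-out portion, rounded up
n_val = n_temp - n_test

print(n_train, n_val, n_test)  # -> 504 63 64
```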
# Split the data into training (80%) and temporary (20%) sets
X_train, X_temp, y_train, y_temp = train_test_split(
grayscale_images, labels, test_size=0.2, stratify=labels, random_state=42
)
# Split the temporary set into validation (50%) and test (50%) sets
X_val, X_test, y_val, y_test = train_test_split(
X_temp, y_temp, test_size=0.5, stratify=y_temp, random_state=42
)
print("Shape of X_train:", X_train.shape)
print("Shape of y_train:", y_train.shape)
print("Shape of X_val:", X_val.shape)
print("Shape of y_val:", y_val.shape)
print("Shape of X_test:", X_test.shape)
print("Shape of y_test:", y_test.shape)
Shape of X_train: (504, 200, 200)
Shape of y_train: (504,)
Shape of X_val: (63, 200, 200)
Shape of y_val: (63,)
Shape of X_test: (64, 200, 200)
Shape of y_test: (64,)
Data Normalization¶
# Convert to float32 and normalize pixel values
X_train = X_train.astype('float32') / 255.0
X_val = X_val.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
# Reshape to add channel dimension (for grayscale images, the channel is 1)
X_train = np.expand_dims(X_train, axis=-1)
X_val = np.expand_dims(X_val, axis=-1)
X_test = np.expand_dims(X_test, axis=-1)
print("New shape of X_train:", X_train.shape)
New shape of X_train: (504, 200, 200, 1)
Section 4: Model Building¶
Model Evaluation Criterion¶
Utility Functions¶
# defining a function to compute different metrics to check the performance of a classification model
def model_performance_classification(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance
    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    # checking which predicted probabilities are greater than the 0.5 threshold
    pred = model.predict(predictors).reshape(-1) > 0.5
    target = np.asarray(target).reshape(-1)  # works for both numpy arrays and pandas Series
    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred, average='weighted')  # to compute Recall
    precision = precision_score(target, pred, average='weighted')  # to compute Precision
    f1 = f1_score(target, pred, average='weighted')  # to compute F1-score
    # creating a dataframe of metrics
    df_perf = pd.DataFrame({"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1}, index=[0])
    return df_perf
def plot_confusion_matrix(model, predictors, target, ml=False):
    """
    Function to plot the confusion matrix
    model: classifier
    predictors: independent variables
    target: dependent variable
    ml: set to True if the model is an sklearn ML model (predicts labels directly)
    """
    # checking which predicted probabilities are greater than the 0.5 threshold
    pred = model.predict(predictors).reshape(-1) > 0.5
    target = np.asarray(target).reshape(-1)  # works for both numpy arrays and pandas Series
    # computing the confusion matrix with tf.math.confusion_matrix
    # (using a local name other than `confusion_matrix` avoids shadowing the sklearn import)
    cm = tf.math.confusion_matrix(target, pred)
    f, ax = plt.subplots(figsize=(10, 8))
    sns.heatmap(
        cm,
        annot=True,
        linewidths=.4,
        fmt="d",
        square=True,
        ax=ax
    )
    plt.show()
Model 1: Simple Convolutional Neural Network (CNN)¶
# Define the simple CNN model
model1 = Sequential([
Conv2D(32, (3, 3), activation='relu', input_shape=(200, 200, 1)),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu'),
MaxPooling2D((2, 2)),
Flatten(),
Dense(64, activation='relu'),
Dense(1, activation='sigmoid') # Output layer for binary classification
])
# Compile the model
model1.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
# Print the model summary
model1.summary()
# Train the model
history1 = model1.fit(X_train, y_train,
epochs=10,
validation_data=(X_val, y_val))
# Plot training history (accuracy and loss)
plt.figure(figsize=(12, 4))
# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history1.history['accuracy'], label='Train Accuracy')
plt.plot(history1.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model 1 Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history1.history['loss'], label='Train Loss')
plt.plot(history1.history['val_loss'], label='Validation Loss')
plt.title('Model 1 Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d (Conv2D) │ (None, 198, 198, 32) │ 320 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d (MaxPooling2D) │ (None, 99, 99, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_1 (Conv2D) │ (None, 97, 97, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_1 (MaxPooling2D) │ (None, 48, 48, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten (Flatten) │ (None, 147456) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense (Dense) │ (None, 64) │ 9,437,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 9,456,129 (36.07 MB)
Trainable params: 9,456,129 (36.07 MB)
Non-trainable params: 0 (0.00 B)
Epoch 1/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 11s 331ms/step - accuracy: 0.5303 - loss: 2.0617 - val_accuracy: 0.9683 - val_loss: 0.4330 Epoch 2/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 39ms/step - accuracy: 0.8935 - loss: 0.3723 - val_accuracy: 0.9683 - val_loss: 0.3752 Epoch 3/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 38ms/step - accuracy: 0.9762 - loss: 0.3376 - val_accuracy: 0.9683 - val_loss: 0.3697 Epoch 4/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 41ms/step - accuracy: 0.9748 - loss: 0.3471 - val_accuracy: 0.9683 - val_loss: 0.2977 Epoch 5/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 39ms/step - accuracy: 0.9879 - loss: 0.1288 - val_accuracy: 0.9683 - val_loss: 0.1418 Epoch 6/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 35ms/step - accuracy: 0.9922 - loss: 0.0230 - val_accuracy: 0.9683 - val_loss: 0.1278 Epoch 7/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 35ms/step - accuracy: 0.9950 - loss: 0.0222 - val_accuracy: 1.0000 - val_loss: 0.0028 Epoch 8/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 35ms/step - accuracy: 1.0000 - loss: 0.0012 - val_accuracy: 0.9683 - val_loss: 0.0386 Epoch 9/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 37ms/step - accuracy: 1.0000 - loss: 5.9418e-04 - val_accuracy: 1.0000 - val_loss: 0.0049 Epoch 10/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 35ms/step - accuracy: 1.0000 - loss: 2.2029e-04 - val_accuracy: 1.0000 - val_loss: 0.0068
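The parameter counts in the model summary above can be reproduced by hand: Conv2D params = (kernel height × kernel width × input channels + 1 bias) × filters, Dense params = (inputs + 1 bias) × units, and each 'valid' 3×3 convolution shrinks the spatial side by 2 before the 2×2 pooling halves it (flooring).

```python
conv1 = (3 * 3 * 1 + 1) * 32      # -> 320
conv2 = (3 * 3 * 32 + 1) * 64     # -> 18,496

side = (200 - 2) // 2             # 198 after conv1, 99 after pooling
side = (side - 2) // 2            # 97 after conv2, 48 after pooling
flat = side * side * 64           # -> 147,456 flattened values

dense1 = (flat + 1) * 64          # -> 9,437,248
dense2 = (64 + 1) * 1             # -> 65

total = conv1 + conv2 + dense1 + dense2
print(total)  # -> 9456129, matching the summary
```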
Model 1 (basic CNN) Observations¶
Clear Overfitting: The train accuracy (the blue line) shot up to 100% very quickly (by epoch 8), which suggests the model is memorizing the 504 training images.
Unstable Validation: The validation accuracy (the orange line) jumps from 96.8% up to 100%, back down to 96.8%, and then back up; the validation loss bounces around in the same way.
Suspiciously High Performance: The model performs surprisingly well for a simple CNN, reaching 100% accuracy on both the training and validation data by the end of 10 epochs.
Conclusion: While the final scores are high, the model shows noticeable signs of overfitting and instability. It is most likely memorizing the training data rather than learning the general idea of a helmet. We can improve on this with more powerful, pre-trained models.
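One inexpensive guard against this kind of overfitting is early stopping: halt training once the validation loss stops improving. Keras offers this as `keras.callbacks.EarlyStopping(monitor='val_loss', patience=...)`; the sketch below (the `epochs_to_run` helper is illustrative, not a Keras API) shows the core patience logic on validation losses shaped like Model 1's:

```python
def epochs_to_run(val_losses, patience=2):
    """Return how many epochs training runs before early stopping halts it:
    stop after `patience` consecutive epochs without a new best (lowest)
    validation loss."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)

# Validation losses like Model 1's: improving at first, then bouncing around
losses = [0.43, 0.37, 0.37, 0.30, 0.14, 0.13, 0.003, 0.039, 0.005, 0.007]
print(epochs_to_run(losses, patience=2))  # -> 9
```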
Visualizing the predictions¶
# Get predictions for the test set
y_pred_prob1 = model1.predict(X_test)
y_pred1 = (y_pred_prob1 > 0.5).astype(int).flatten()
# Get 10 random indices from the test set
random_indices = np.random.choice(len(X_test), 10, replace=False)
# Visualize the predictions
plt.figure(figsize=(15, 8))
for i, idx in enumerate(random_indices):
plt.subplot(2, 5, i + 1)
plt.imshow(X_test[idx].squeeze(), cmap='gray') # Use squeeze() to remove the channel dimension for display
plt.title(f"True: {y_test[idx]}\nPred: {y_pred1[idx]}")
plt.axis('off')
plt.tight_layout()
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
Model 1's predictions are consistent with its 96%–100% accuracy; it seems to have learned to differentiate the images. The visual difference between a close-up face and a worker outdoors may simply be easy for a simple CNN to exploit:
- a close-up face is label 0
- a complex outdoor construction scene is label 1
If that is the pattern, it is very easy to learn, which would allow the simple model to solve the task almost perfectly with just 10 epochs.
Model 2: VGG-16 (Base)¶
# Prepare data for VGG-16 (convert grayscale to RGB)
X_train_rgb = tf.image.grayscale_to_rgb(tf.constant(X_train))
X_val_rgb = tf.image.grayscale_to_rgb(tf.constant(X_val))
X_test_rgb = tf.image.grayscale_to_rgb(tf.constant(X_test))
# Load VGG-16 base model
vgg_base = VGG16(weights='imagenet',
include_top=False,
input_shape=(200, 200, 3))
# Freeze the VGG-16 base layers
vgg_base.trainable = False
# Define the model
model2 = Sequential([
vgg_base,
Flatten(),
Dense(1, activation='sigmoid') # Output layer for binary classification
])
# Compile the model
model2.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
# Print the model summary
model2.summary()
# Train the model
history2 = model2.fit(X_train_rgb, y_train,
epochs=10,
validation_data=(X_val_rgb, y_val))
# Plot training history (accuracy and loss)
plt.figure(figsize=(12, 4))
# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history2.history['accuracy'], label='Train Accuracy')
plt.plot(history2.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model 2 Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history2.history['loss'], label='Train Loss')
plt.plot(history2.history['val_loss'], label='Validation Loss')
plt.title('Model 2 Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5 58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ vgg16 (Functional) │ (None, 6, 6, 512) │ 14,714,688 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_1 (Flatten) │ (None, 18432) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 1) │ 18,433 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 14,733,121 (56.20 MB)
Trainable params: 18,433 (72.00 KB)
Non-trainable params: 14,714,688 (56.13 MB)
Epoch 1/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 35s 1s/step - accuracy: 0.9179 - loss: 0.3265 - val_accuracy: 1.0000 - val_loss: 0.0156 Epoch 2/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 167ms/step - accuracy: 1.0000 - loss: 0.0074 - val_accuracy: 1.0000 - val_loss: 0.0046 Epoch 3/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 165ms/step - accuracy: 1.0000 - loss: 0.0022 - val_accuracy: 1.0000 - val_loss: 0.0035 Epoch 4/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 162ms/step - accuracy: 1.0000 - loss: 0.0018 - val_accuracy: 1.0000 - val_loss: 0.0032 Epoch 5/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 163ms/step - accuracy: 1.0000 - loss: 0.0016 - val_accuracy: 1.0000 - val_loss: 0.0029 Epoch 6/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 163ms/step - accuracy: 1.0000 - loss: 0.0013 - val_accuracy: 1.0000 - val_loss: 0.0027 Epoch 7/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 168ms/step - accuracy: 1.0000 - loss: 0.0012 - val_accuracy: 1.0000 - val_loss: 0.0026 Epoch 8/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 166ms/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 0.0024 Epoch 9/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 164ms/step - accuracy: 1.0000 - loss: 0.0010 - val_accuracy: 1.0000 - val_loss: 0.0023 Epoch 10/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 167ms/step - accuracy: 1.0000 - loss: 9.1518e-04 - val_accuracy: 1.0000 - val_loss: 0.0022
Model 2 (VGG-16) Observations¶
Great Performance: The model is a marked improvement over Model 1, reaching 100% validation accuracy in the very first epoch.
Stable: Unlike Model 1, the training and validation loss curves for Model 2 are smooth and drop to almost zero immediately. The model is confident and stable.
Training Efficiency: From model.summary() we can see that only 18,433 parameters (the new Dense layer) were trained, not the whole 14.7-million-parameter VGG-16 base. This is far more efficient than training a model from scratch.
Transfer Learning: Reusing VGG-16 features learned from the large ImageNet dataset is extremely effective for this problem. The base model already knows how to recognize edges, shapes, and textures, so the final layer only had to learn to separate the two categories of images.
Visualizing the predictions¶
# Get predictions for the test set using Model 2
y_pred_prob2 = model2.predict(X_test_rgb)
y_pred2 = (y_pred_prob2 > 0.5).astype(int).flatten()
# Get 10 random indices from the test set (using the original X_test indices)
random_indices = np.random.choice(len(X_test_rgb), 10, replace=False)
# Visualize the predictions
plt.figure(figsize=(15, 8))
for i, idx in enumerate(random_indices):
plt.subplot(2, 5, i + 1)
plt.imshow(X_test_rgb[idx]) # Display the RGB image
plt.title(f"True: {y_test[idx]}\nPred: {y_pred2[idx]}")
plt.axis('off')
plt.tight_layout()
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 199ms/step
- Perfect Performance: The model predicts the 10 sampled unseen test images 10/10 correctly.
- Confirming Stability: This result, alongside 100% validation accuracy, suggests the model is stable and confident rather than just lucky, though the validation and test sets are small (63 and 64 images).
Model 2 appears to have successfully learned the task.
Model 3: VGG-16 (Base + FFNN)¶
# Load VGG-16 base model
vgg_base = VGG16(weights='imagenet',
include_top=False,
input_shape=(200, 200, 3))
# Freeze the VGG-16 base layers
vgg_base.trainable = False
# Define the model
model3 = Sequential([
vgg_base,
Flatten(),
Dense(128, activation='relu'), # Added Dense layer
Dropout(0.5), # Added Dropout layer
Dense(1, activation='sigmoid') # Output layer for binary classification
])
# Compile the model
model3.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
# Print the model summary
model3.summary()
# Train the model
history3 = model3.fit(X_train_rgb, y_train,
epochs=10,
validation_data=(X_val_rgb, y_val))
# Plot training history (accuracy and loss)
plt.figure(figsize=(12, 4))
# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history3.history['accuracy'], label='Train Accuracy')
plt.plot(history3.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model 3 Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history3.history['loss'], label='Train Loss')
plt.plot(history3.history['val_loss'], label='Validation Loss')
plt.title('Model 3 Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ vgg16 (Functional) │ (None, 6, 6, 512) │ 14,714,688 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_2 (Flatten) │ (None, 18432) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 128) │ 2,359,424 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_4 (Dense) │ (None, 1) │ 129 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,074,241 (65.13 MB)
Trainable params: 2,359,553 (9.00 MB)
Non-trainable params: 14,714,688 (56.13 MB)
Epoch 1/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 10s 406ms/step - accuracy: 0.7811 - loss: 0.6541 - val_accuracy: 1.0000 - val_loss: 1.3229e-04 Epoch 2/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 162ms/step - accuracy: 1.0000 - loss: 0.0010 - val_accuracy: 1.0000 - val_loss: 2.9653e-04 Epoch 3/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 163ms/step - accuracy: 0.9961 - loss: 0.0056 - val_accuracy: 1.0000 - val_loss: 3.4198e-05 Epoch 4/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 163ms/step - accuracy: 1.0000 - loss: 0.0013 - val_accuracy: 1.0000 - val_loss: 3.7140e-05 Epoch 5/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 164ms/step - accuracy: 1.0000 - loss: 3.9232e-04 - val_accuracy: 1.0000 - val_loss: 4.0409e-05 Epoch 6/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 190ms/step - accuracy: 1.0000 - loss: 1.9670e-04 - val_accuracy: 1.0000 - val_loss: 4.3577e-05 Epoch 7/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 163ms/step - accuracy: 1.0000 - loss: 3.0124e-04 - val_accuracy: 1.0000 - val_loss: 1.8757e-05 Epoch 8/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 163ms/step - accuracy: 1.0000 - loss: 3.8243e-04 - val_accuracy: 1.0000 - val_loss: 1.3402e-05 Epoch 9/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 164ms/step - accuracy: 1.0000 - loss: 2.5268e-04 - val_accuracy: 1.0000 - val_loss: 1.1776e-05 Epoch 10/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 187ms/step - accuracy: 1.0000 - loss: 1.3585e-04 - val_accuracy: 1.0000 - val_loss: 1.1005e-05
Model 3 (VGG-16 + FFNN) Observations¶
- Perfect Performance (again): Just like Model 2, this model got 100% validation accuracy and quickly reached 100% training accuracy.
- Slightly Slower Start (to note): The train accuracy (blue line) started low in the first epoch, while the validation accuracy (orange line) was already at 100%. This is normal behavior: the randomly initialized Dense(128) layer simply needed one epoch to learn the task.
- More Complex, Unclear Benefit: From model.summary():
  - Model 2 (VGG-16) had 18,433 trainable parameters.
  - Model 3 (VGG-16 + FFNN) has 2,359,553 trainable parameters.
- Conclusion: This model is much larger and more computationally expensive than Model 2, but any accuracy improvement is not visible on the current test data.
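The jump in trainable parameters comes almost entirely from the Dense(128) layer sitting on the 18,432-value flattened feature map, which a quick back-of-the-envelope calculation confirms:

```python
# Flattened VGG-16 output: 6 * 6 * 512 = 18,432 values per image
flat = 6 * 6 * 512

# Dense params = (inputs + 1 bias) * units
model2_trainable = (flat + 1) * 1                     # single sigmoid unit
model3_trainable = (flat + 1) * 128 + (128 + 1) * 1   # Dense(128) + sigmoid unit

print(model2_trainable, model3_trainable)  # -> 18433 2359553
```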
Visualizing the predictions¶
# Get predictions for the test set using Model 3
y_pred_prob3 = model3.predict(X_test_rgb)
y_pred3 = (y_pred_prob3 > 0.5).astype(int).flatten()
# Get 10 random indices from the test set (using the original X_test indices)
random_indices = np.random.choice(len(X_test_rgb), 10, replace=False)
# Visualize the predictions
plt.figure(figsize=(15, 8))
for i, idx in enumerate(random_indices):
plt.subplot(2, 5, i + 1)
plt.imshow(X_test_rgb[idx]) # Display the RGB image
plt.title(f"True: {y_test[idx]}\nPred: {y_pred3[idx]}")
plt.axis('off')
plt.tight_layout()
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 173ms/step
Both Model 2 and Model 3 solved the problem with 100% accuracy. The extra layers in Model 3 did not add any measurable performance benefit, mostly because Model 2 was already handling the task perfectly.
Model 4: VGG-16 (Base + FFNN + Data Augmentation)¶
In most of the real-world case studies, it is challenging to acquire a large number of images and then train CNNs.
To overcome this problem, one approach we might consider is Data Augmentation.
CNNs have the property of translational invariance, which means they can recognise an object even if its appearance shifts translationally in some way. Taking this attribute into account, we can augment the images using the techniques listed below:
- Horizontal Flip (should be set to True/False)
- Vertical Flip (should be set to True/False)
- Height Shift (should be between 0 and 1)
- Width Shift (should be between 0 and 1)
- Rotation (should be between 0 and 180)
- Shear (should be between 0 and 1)
- Zoom (should be between 0 and 1) etc.
Remember, data augmentation should not be used in the validation/test data set.
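The notebook applies these transforms with Keras preprocessing layers. As a quick intuition for what a random transform does to an image array, here is a NumPy sketch of a horizontal flip only (the `random_horizontal_flip` helper is illustrative; Keras additionally handles interpolation for rotations and zooms):

```python
import numpy as np

def random_horizontal_flip(img, rng, p=0.5):
    """Flip the width axis of an H x W x C image with probability p,
    mirroring what RandomFlip('horizontal') does at training time."""
    return img[:, ::-1, :] if rng.random() < p else img

rng = np.random.default_rng(0)
img = np.arange(12, dtype=np.float32).reshape(2, 2, 3)  # tiny 2x2 RGB image
flipped = img[:, ::-1, :]

# Flipping twice restores the original image
assert np.array_equal(flipped[:, ::-1, :], img)
```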
# Define data augmentation layers
data_augmentation = Sequential([
tf.keras.layers.RandomFlip('horizontal'),
tf.keras.layers.RandomRotation(0.1),
tf.keras.layers.RandomZoom(0.1),
])
# Load VGG-16 base model (same as Model 3)
vgg_base = VGG16(weights='imagenet',
include_top=False,
input_shape=(200, 200, 3))
# Freeze the VGG-16 base layers (same as Model 3)
vgg_base.trainable = False
# Define the model with data augmentation
model4 = Sequential([
data_augmentation, # Add data augmentation as the first layer
vgg_base,
Flatten(),
Dense(128, activation='relu'),
Dropout(0.5),
Dense(1, activation='sigmoid')
])
# Compile the model
model4.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
# Print the model summary
model4.summary()
# Train the model
history4 = model4.fit(X_train_rgb, y_train,
epochs=10,
validation_data=(X_val_rgb, y_val))
# Plot training history (accuracy and loss)
plt.figure(figsize=(12, 4))
# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history4.history['accuracy'], label='Train Accuracy')
plt.plot(history4.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model 4 Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history4.history['loss'], label='Train Loss')
plt.plot(history4.history['val_loss'], label='Validation Loss')
plt.title('Model 4 Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ sequential_3 (Sequential) │ ? │ 0 (unbuilt) │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ vgg16 (Functional) │ (None, 6, 6, 512) │ 14,714,688 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_3 (Flatten) │ ? │ 0 (unbuilt) │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_5 (Dense) │ ? │ 0 (unbuilt) │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_1 (Dropout) │ ? │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_6 (Dense) │ ? │ 0 (unbuilt) │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 14,714,688 (56.13 MB)
Trainable params: 0 (0.00 B)
Non-trainable params: 14,714,688 (56.13 MB)
Epoch 1/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 14s 447ms/step - accuracy: 0.8067 - loss: 0.8281 - val_accuracy: 1.0000 - val_loss: 4.9737e-04 Epoch 2/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 141ms/step - accuracy: 0.9955 - loss: 0.0063 - val_accuracy: 1.0000 - val_loss: 9.4640e-05 Epoch 3/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 3s 170ms/step - accuracy: 0.9937 - loss: 0.0108 - val_accuracy: 1.0000 - val_loss: 8.8581e-05 Epoch 4/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 146ms/step - accuracy: 0.9996 - loss: 0.0016 - val_accuracy: 1.0000 - val_loss: 9.2525e-05 Epoch 5/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 141ms/step - accuracy: 0.9989 - loss: 0.0050 - val_accuracy: 1.0000 - val_loss: 1.2480e-04 Epoch 6/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 129ms/step - accuracy: 0.9987 - loss: 0.0093 - val_accuracy: 1.0000 - val_loss: 2.1831e-04 Epoch 7/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 128ms/step - accuracy: 1.0000 - loss: 0.0046 - val_accuracy: 1.0000 - val_loss: 2.5385e-04 Epoch 8/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 129ms/step - accuracy: 0.9991 - loss: 0.0035 - val_accuracy: 1.0000 - val_loss: 1.0010e-04 Epoch 9/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 136ms/step - accuracy: 1.0000 - loss: 0.0021 - val_accuracy: 1.0000 - val_loss: 6.4038e-05 Epoch 10/10 16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 132ms/step - accuracy: 0.9996 - loss: 0.0018 - val_accuracy: 1.0000 - val_loss: 6.1361e-05
Model 4 (VGG-16 + FFNN + Data Augmentation) Observations¶
This is the most interesting result, and it shows why data augmentation gives us a better check on whether our models are really learning.
- Perfect Validation Score: Just like Models 2 and 3, this model reached 100% validation accuracy, so it is still a "perfect" model on the clean validation data.
- Data Augmentation at Work: The training accuracy (blue line) bounces around a bit. This is a sign that the augmentation is doing its job: the model isn't seeing the same 504 training images each epoch, but randomly flipped, zoomed, and rotated versions of them, which makes the task harder. The fluctuation in training accuracy shows the model is being pushed to learn to recognize helmets rather than simply memorize the images.
- Most Robust Model so far:
  - Models 2 and 3 showed they could reach 100% partly by memorizing the training data.
  - Model 4 reached 100% on the validation data while being fed a noisier, augmented training set.
  - Model 4 appears to have learned the general idea of a helmet, not just specific training pictures. It is a robust model that is more likely to work in the real world.
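As a sketch of the kind of augmentation described above (the specific Keras layers and factors here are assumptions for illustration, not necessarily this notebook's exact configuration), a small preprocessing pipeline can apply random flips, rotations, and zooms on the fly:

```python
import numpy as np
import tensorflow as tf

# Illustrative augmentation pipeline: random flips, rotations, and zooms.
# The layer choices and factors are assumptions for this sketch.
augmenter = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # rotate by up to ±10% of a full turn
    tf.keras.layers.RandomZoom(0.1),      # zoom in/out by up to 10%
])

batch = np.random.rand(4, 200, 200, 3).astype("float32")  # dummy 200x200 RGB batch
augmented = augmenter(batch, training=True)  # training=True activates the randomness
print(augmented.shape)  # shape is preserved; only the pixel content is perturbed
```

Because the random transforms are re-sampled every epoch, the network effectively never sees the exact same training image twice, which is what produces the bouncier training-accuracy curve noted above.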
Visualizing the predictions¶
# Get predictions for the test set using Model 4
y_pred_prob4 = model4.predict(X_test_rgb)
y_pred4 = (y_pred_prob4 > 0.5).astype(int).flatten()
# Get 10 random indices from the test set (using the original X_test indices)
random_indices = np.random.choice(len(X_test_rgb), 10, replace=False)
# Visualize the predictions
plt.figure(figsize=(15, 8))
for i, idx in enumerate(random_indices):
plt.subplot(2, 5, i + 1)
plt.imshow(X_test_rgb[idx]) # Display the RGB image
plt.title(f"True: {y_test[idx]}\nPred: {y_pred4[idx]}")
plt.axis('off')
plt.tight_layout()
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 146ms/step
Section 5: Model Performance Comparison and Final Model Selection¶
# Get the final validation accuracy and loss for each model
model1_val_accuracy = history1.history['val_accuracy'][-1]
model1_val_loss = history1.history['val_loss'][-1]
model2_val_accuracy = history2.history['val_accuracy'][-1]
model2_val_loss = history2.history['val_loss'][-1]
model3_val_accuracy = history3.history['val_accuracy'][-1]
model3_val_loss = history3.history['val_loss'][-1]
model4_val_accuracy = history4.history['val_accuracy'][-1]
model4_val_loss = history4.history['val_loss'][-1]
# Create a dictionary with the performance metrics
performance_data = {
'Validation Accuracy': [model1_val_accuracy, model2_val_accuracy, model3_val_accuracy, model4_val_accuracy],
'Validation Loss': [model1_val_loss, model2_val_loss, model3_val_loss, model4_val_loss]
}
# Create a pandas DataFrame
performance_df = pd.DataFrame(performance_data, index=['Simple CNN', 'VGG-16 Base', 'VGG-16 + FFNN', 'VGG-16 + Augmentation'])
# Display the DataFrame
display(performance_df)
| | Validation Accuracy | Validation Loss |
|---|---|---|
| Simple CNN | 1.0 | 0.006788 |
| VGG-16 Base | 1.0 | 0.002180 |
| VGG-16 + FFNN | 1.0 | 0.000011 |
| VGG-16 + Augmentation | 1.0 | 0.000061 |
Model Comparison¶
- Accuracy: All four models achieved 100% validation accuracy, so the problem, as posed by this dataset, was solvable by every architecture.
- Loss (the key metric): The validation loss tells the real story; it shows how confident each model is in its predictions.
  - Model 1 (Simple CNN): `0.006788` is the highest loss, making it the least confident model.
  - Model 4 (VGG-16 + Augmentation): `0.000061` is slightly higher than Model 3's `0.000011`, but remarkably low given the messier, augmented data it had to train on.

Best Model Selection: We choose Model 4 (VGG-16 + Augmentation). It achieved its perfect score while training on augmented data, which makes it the most robust model and the most reliable choice for real-world use.
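The same comparison can be driven programmatically. A minimal sketch (loss values copied from the table above): note that a pure lowest-loss rule would pick Model 3, which is exactly why the robustness argument for Model 4 matters.

```python
import pandas as pd

# Values copied from the comparison table above.
performance_df = pd.DataFrame(
    {"Validation Accuracy": [1.0, 1.0, 1.0, 1.0],
     "Validation Loss": [0.006788, 0.002180, 0.000011, 0.000061]},
    index=["Simple CNN", "VGG-16 Base", "VGG-16 + FFNN", "VGG-16 + Augmentation"],
)

# With accuracy tied at 100%, the lowest validation loss is the obvious tiebreak...
lowest_loss = performance_df["Validation Loss"].idxmin()
print(lowest_loss)  # VGG-16 + FFNN

# ...but we override it in favor of Model 4, which earned its score
# on augmented (harder) training data.
chosen = "VGG-16 + Augmentation"
```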
Test Performance¶
# Evaluate Model 4 on the test set
loss, accuracy = model4.evaluate(X_test_rgb, y_test, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
# Generate predictions for the test set
y_pred_prob4 = model4.predict(X_test_rgb)
y_pred4 = (y_pred_prob4 > 0.5).astype(int).flatten()
# Generate and plot the confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred4)
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
xticklabels=['Without Helmet', 'With Helmet'],
yticklabels=['Without Helmet', 'With Helmet'])
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix - Model 4')
plt.show()
# Print the classification report
print("\nClassification Report - Model 4:")
print(classification_report(y_test, y_pred4, target_names=['Without Helmet', 'With Helmet']))
Test Loss: 0.0000
Test Accuracy: 1.0000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 118ms/step
Classification Report - Model 4:
precision recall f1-score support
Without Helmet 1.00 1.00 1.00 32
With Helmet 1.00 1.00 1.00 32
accuracy 1.00 64
macro avg 1.00 1.00 1.00 64
weighted avg 1.00 1.00 1.00 64
Final Performance Test¶
- Test Accuracy: The model scored 100% on the final test set.
- Confusion Matrix: The confusion matrix shows this visually.
  - Correctly identified all 32 "Without Helmet" images.
  - Correctly identified all 32 "With Helmet" images.
  - It made zero mistakes.
- Classification Report: This confirms the perfect score. Precision and recall are both 1.00, meaning that, on this test set, the model is both precise and thorough.
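As a sanity check on the report, the same precision and recall numbers can be recomputed by hand from the confusion matrix (values taken directly from the plot above):

```python
import numpy as np

# Confusion matrix from the plot above: rows are true labels,
# columns are predicted labels (0 = Without Helmet, 1 = With Helmet).
cm = np.array([[32, 0],
               [0, 32]])

# Precision and recall for the "With Helmet" class (label 1).
tp, fp, fn = cm[1, 1], cm[0, 1], cm[1, 0]
precision = tp / (tp + fp)  # of everything predicted "With Helmet", how much was right
recall = tp / (tp + fn)     # of all true "With Helmet" images, how many were found
print(precision, recall)    # 1.0 1.0, matching the classification report
```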
Section 6: Test model with own no-helmet images¶
The Problem: A Misleading 100% Score¶
All four of our models achieved 100% (or near 100%) accuracy on the original test data. This looks like a perfect result, but we have a strong reason to believe it's misleading.
There appears to be a major flaw in the original dataset:
- "Without Helmet" images: Are almost all close-up portraits of faces.
- "With Helmet" images: Are all full-body, outdoor, or industrial scenes.
Our Hypothesis¶
The models did not actually learn to identify helmets. They likely learned a simple "cheat":
- If the image is a portrait, predict "Without Helmet."
- If the image is an outdoor scene, predict "With Helmet."
The Goal of This Test¶
The goal of this section is to prove our hypothesis. We will load 5 new test images of workers in outdoor scenes who are not wearing helmets.
If our hypothesis is correct, all our models will see the "outdoor scene" and incorrectly predict "With Helmet" (Label 1). This will confirm that the model is flawed and not ready for real-world use.
Let's call these images our "BYO-no-helmet" data.
# Define the directory containing the new "without helmet" images
new_test_dir = '/content/drive/My Drive/Colab/ICV Proj/new_test_images' # Assuming they are in a subfolder
# Initialize a list to store the new test images
new_test_images = []
new_test_labels = [] # Since they are all "without helmet", the label will be 0
# Define the target image size (should match the training images)
image_size = (200, 200) # Assuming the original training images were 200x200
# Load the new test images
for i in range(1, 6): # Assuming filenames are "without-helmet-1.png" to "without-helmet-5.png"
filename = f'without-helmet-{i}.png'
img_path = os.path.join(new_test_dir, filename)
img = cv2.imread(img_path)
if img is not None:
# Resize and convert to RGB (if needed, depending on the model)
# Assuming Model 4 (best model) expects RGB input
img_resized = cv2.resize(img, image_size)
# If the loaded image is grayscale, convert to RGB. If it's already RGB, this does nothing.
if len(img_resized.shape) == 2:
img_resized = cv2.cvtColor(img_resized, cv2.COLOR_GRAY2RGB)
elif img_resized.shape[2] == 4: # Handle potential alpha channel
img_resized = cv2.cvtColor(img_resized, cv2.COLOR_BGRA2RGB)
else: # Assume BGR
img_resized = cv2.cvtColor(img_resized, cv2.COLOR_BGR2RGB)
new_test_images.append(img_resized)
new_test_labels.append(0) # Label 0 for "without helmet"
else:
print(f"Warning: Could not load image {img_path}")
# Convert lists to NumPy arrays
new_test_images = np.array(new_test_images)
new_test_labels = np.array(new_test_labels)
print("Shape of new test images array:", new_test_images.shape)
print("Shape of new test labels array:", new_test_labels.shape)
# Convert to grayscale and normalize for Model 1
new_test_images_gray = []
for img in new_test_images:
gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
new_test_images_gray.append(gray_img)
new_test_images_gray = np.array(new_test_images_gray).astype('float32') / 255.0
new_test_images_gray = np.expand_dims(new_test_images_gray, axis=-1)
# Normalize for RGB models (Models 2, 3, and 4)
new_test_images_processed = new_test_images.astype('float32') / 255.0
Shape of new test images array: (5, 200, 200, 3)
Shape of new test labels array: (5,)
# Evaluate Model 4 on the new test images
# Convert grayscale images to RGB format for Model 4 (VGG16 expects 3 channels)
new_test_images_rgb_for_model4 = tf.image.grayscale_to_rgb(tf.constant(new_test_images_gray))
loss, accuracy = model4.evaluate(new_test_images_rgb_for_model4, new_test_labels, verbose=0)
print(f"Performance on new test images:")
print(f" Loss: {loss:.4f}")
print(f" Accuracy: {accuracy:.4f}")
# Generate predictions for the new test images
# Using RGB images for Model 4 now
new_test_pred_prob = model4.predict(new_test_images_rgb_for_model4)
new_test_pred = (new_test_pred_prob > 0.5).astype(int).flatten()
# Print the predictions
print("\nPredictions for new test images:")
for i, pred in enumerate(new_test_pred):
print(f"Image {i+1}: Predicted Label = {pred} (True Label = {new_test_labels[i]})")
Performance on new test images:
  Loss: 19.0860
  Accuracy: 0.0000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step

Predictions for new test images:
Image 1: Predicted Label = 1 (True Label = 0)
Image 2: Predicted Label = 1 (True Label = 0)
Image 3: Predicted Label = 1 (True Label = 0)
Image 4: Predicted Label = 1 (True Label = 0)
Image 5: Predicted Label = 1 (True Label = 0)
plt.figure(figsize=(10, 4))
for i in range(len(new_test_images)):
plt.subplot(1, len(new_test_images), i + 1)
plt.imshow(new_test_images_gray[i].squeeze(), cmap='gray')
plt.title(f"True: {new_test_labels[i]}\nPred: {new_test_pred[i]}")
plt.axis('off')
plt.tight_layout()
plt.show()
Observation of Test BYO-no-helmet¶
- Completely fails the task: Our best-performing model fails to identify a single missing helmet. This confirms that the model was trained on other differences between the two classes, not on the helmet itself.
- Definition of "Good Data": The two classes should ideally be identical except for the presence of the helmet, so the model cannot pick up on other patterns that separate the two categories.
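One cheap way to probe for this kind of confound is to check whether a trivial global statistic, such as mean brightness, already separates the two classes. The sketch below uses random stand-in arrays since the real image arrays and labels (e.g. `X_train`, `y_train`) live earlier in the notebook:

```python
import numpy as np

# Stand-in data for illustration; in the notebook this would be the
# real image array and labels.
rng = np.random.default_rng(0)
X = rng.random((20, 200, 200, 3)).astype("float32")
y = np.array([0] * 10 + [1] * 10)

# Mean brightness per image, then the gap between class means.
brightness = X.mean(axis=(1, 2, 3))
gap = abs(brightness[y == 0].mean() - brightness[y == 1].mean())
print(f"brightness gap between classes: {gap:.4f}")
# A large gap would mean a classifier could "cheat" on global brightness
# alone (e.g. bright outdoor scenes vs. indoor portraits) without ever
# looking at the helmet.
```

If even a one-number summary like this splits the classes cleanly, a deep network will certainly find and exploit the same shortcut.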
Testing BYO-no-helmet on Model 1, 2, and 3 for Observations¶
# --- Evaluate Model 1 ---
print("--- Model 1 Performance on new test images ---")
# Model 1 expects grayscale input with channel dimension, which new_test_images_gray is
loss1, accuracy1 = model1.evaluate(new_test_images_gray, new_test_labels, verbose=0)
print(f" Loss: {loss1:.4f}")
print(f" Accuracy: {accuracy1:.4f}")
new_test_pred_prob1 = model1.predict(new_test_images_gray)
new_test_pred1 = (new_test_pred_prob1 > 0.5).astype(int).flatten()
print("Predictions for new test images (Model 1):")
for i, pred in enumerate(new_test_pred1):
print(f"Image {i+1}: Predicted Label = {pred} (True Label = {new_test_labels[i]})")
# --- Evaluate Model 2 ---
print("\n--- Model 2 Performance on new test images ---")
# Model 2 (VGG16 base) expects 3-channel input. Convert grayscale to RGB.
new_test_images_rgb_for_model2 = tf.image.grayscale_to_rgb(tf.constant(new_test_images_gray))
loss2, accuracy2 = model2.evaluate(new_test_images_rgb_for_model2, new_test_labels, verbose=0)
print(f" Loss: {loss2:.4f}")
print(f" Accuracy: {accuracy2:.4f}")
new_test_pred_prob2 = model2.predict(new_test_images_rgb_for_model2)
new_test_pred2 = (new_test_pred_prob2 > 0.5).astype(int).flatten()
print("Predictions for new test images (Model 2):")
for i, pred in enumerate(new_test_pred2):
print(f"Image {i+1}: Predicted Label = {pred} (True Label = {new_test_labels[i]})")
# --- Evaluate Model 3 ---
print("\n--- Model 3 Performance on new test images ---")
# Model 3 (VGG16 base + FFNN) expects 3-channel input. Convert grayscale to RGB.
new_test_images_rgb_for_model3 = tf.image.grayscale_to_rgb(tf.constant(new_test_images_gray))
loss3, accuracy3 = model3.evaluate(new_test_images_rgb_for_model3, new_test_labels, verbose=0)
print(f" Loss: {loss3:.4f}")
print(f" Accuracy: {accuracy3:.4f}")
new_test_pred_prob3 = model3.predict(new_test_images_rgb_for_model3)
new_test_pred3 = (new_test_pred_prob3 > 0.5).astype(int).flatten()
print("Predictions for new test images (Model 3):")
for i, pred in enumerate(new_test_pred3):
print(f"Image {i+1}: Predicted Label = {pred} (True Label = {new_test_labels[i]})")
--- Model 1 Performance on new test images ---
  Loss: 81.0079
  Accuracy: 0.0000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step
Predictions for new test images (Model 1):
Image 1: Predicted Label = 1 (True Label = 0)
Image 2: Predicted Label = 1 (True Label = 0)
Image 3: Predicted Label = 1 (True Label = 0)
Image 4: Predicted Label = 1 (True Label = 0)
Image 5: Predicted Label = 1 (True Label = 0)

--- Model 2 Performance on new test images ---
  Loss: 11.1684
  Accuracy: 0.0000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 77ms/step
Predictions for new test images (Model 2):
Image 1: Predicted Label = 1 (True Label = 0)
Image 2: Predicted Label = 1 (True Label = 0)
Image 3: Predicted Label = 1 (True Label = 0)
Image 4: Predicted Label = 1 (True Label = 0)
Image 5: Predicted Label = 1 (True Label = 0)

--- Model 3 Performance on new test images ---
  Loss: 25.4114
  Accuracy: 0.0000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step
Predictions for new test images (Model 3):
Image 1: Predicted Label = 1 (True Label = 0)
Image 2: Predicted Label = 1 (True Label = 0)
Image 3: Predicted Label = 1 (True Label = 0)
Image 4: Predicted Label = 1 (True Label = 0)
Image 5: Predicted Label = 1 (True Label = 0)
You can see that all three models fail this task as well.
Section 7: Actionable Insights & Recommendations¶
Actionable Insights¶
- Model learned the task (a bit too easily): Automated helmet detection is entirely feasible with this dataset. The visual difference between a close-up face (class 0) and a worker on-site (class 1) is very clear, which is why the models performed perfectly on this data.
- Best Model: The VGG-16 model with data augmentation (Model 4) is the best choice. It is the most robust and most confident, which makes it the strongest candidate for real-world use.
- Transfer learning works well for this task: Transfer learning with VGG-16 was far more stable and effective than building a CNN from scratch (Model 1).
- Critical dataset flaw: The models' 100% accuracy is highly misleading. There is a spurious correlation in the dataset, and the models are very likely learning this pattern rather than the presence of a helmet:
  - Close-up portraits = the "Without Helmet" class.
  - Full-body outdoor scenes = the "With Helmet" class.
Business Recommendations¶
- Hold off on the pilot program: This model may fail in the real world. It is likely classifying any close-up face as "Without Helmet" and any worker seen from a distance as "With Helmet," even if they aren't wearing one.
- Re-collect data: The project should collect new, high-quality "Without Helmet" data.
- Also expand the dataset: To make the model even better, the company should collect more data, especially of:
  - Workers in low-light or dark conditions.
  - Workers in bad weather (rain or snow).
  - Workers who are far away from the camera.
  - Workers wearing other headgear such as caps or beanies.
- Redefine "Good Data": The new data must show workers on the same construction sites, in the same poses and at the same distances, but simply not wearing their helmets. This will force the model to learn the actual feature (the helmet) and ignore the background.
Power Ahead!
!jupyter nbconvert --to html "/content/drive/My Drive/Colab/ICV Proj/HelmNet_Full_Code.ipynb"
[NbConvertApp] Converting notebook /content/drive/My Drive/Colab/ICV Proj/HelmNet_Full_Code.ipynb to html [NbConvertApp] WARNING | Alternative text is missing on 13 image(s). [NbConvertApp] Writing 5831050 bytes to /content/drive/My Drive/Colab/ICV Proj/HelmNet_Full_Code.html